Method and system of unifying data

ABSTRACT

A system, method and database design is provided for unifying data from a plurality of heterogeneous databases, each having business-context related data and a data access mechanism. A database is created (e.g., the UniDimNet) which contains a node for each dimension of an industry. For each data source which is accessible via the system, a set of data source specific dimensions is created and mapped to the corresponding industry business context dimension(s). A set of templates (e.g., UniViews) is created to query the data sources. Each UniView contains a specific question for a specific dimension designed for a specific data source. UniViews query the database they are associated with by using the data access mechanism of the associated database. A central server coordinates the system and facilitates use of the system through an interface (e.g., the UniViewer). The UniViewer allows a user to query the data sources by identifying an industry business context dimension, a dimension instance and at least one UniView. Multiple UniViews can be combined, cached and saved to facilitate complex queries.

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of U.S. Provisional Application No. 60/398,841 filed Jul. 26, 2002.

TECHNICAL FIELD OF THE INVENTION

[0002] The present invention relates to the database management arts. It finds particular application as a method and system for accessing heterogeneous data in a plurality of databases.

BACKGROUND OF THE INVENTION

[0003] Most modern businesses employ a multitude of different software applications which collect and store huge amounts of data. These applications store data in a multitude of different ways (e.g., a flat file, a relational database or an object database) and in a format specific to the application in question (e.g., a binary file format). Access to this data is achieved through a variety of differing mechanisms (e.g., a direct database query, various reporting tools, or via an application program interface (an “API”)), each offering differing levels of flexibility, ease of use and coherence to the business concepts underlying the data. This data is generally heterogeneous but also generally related in terms of the underlying business concepts which drive and populate the databases.

[0004] It is generally desirable for such a business to pull together all of this data and access it centrally, with data from different applications merging together to afford views which span the multiple and disparate applications and which further offer overall pictures relating to the underlying business concepts. Such an aggregation is difficult, however, as each data source may have its data stored in a unique, individual way on some form of persistent hardware which is not necessarily compatible with the ways and hardware of other data sources. Furthermore, the actual format of the data is usually very complex and not directly business-relevant, which exacerbates the difficulty in aggregating, or unifying, the data in the multiple data sources.

[0005] To attempt to alleviate these difficulties, it is desirable for each data source to offer some form of data access mechanism (e.g., a set of pre-built queries or a set of API's) to facilitate access to itself. These mechanisms offer access to sets or subsets of information which are more abstract and meaningful in a business sense than the raw data stored in each data source. For example, a pre-built query may query a database to retrieve all information relating to a particular individual and return only information relevant to such an individual. While such data access mechanisms are beneficial for obtaining business-relevant information from a single database (or at least a single type of database), such data access mechanisms are specific to the application which relates to the database. The design and technical specifications for any particular data access mechanism generally differ significantly from those of another particular data access mechanism. As such, a data access mechanism for a particular database is not likely usable for a different database, let alone for each of the many disparate databases which make up the information store of a modern business. Such access mechanisms are not readily adaptable to facilitate views which span multiple databases and which further facilitate overall pictures relating to underlying business concepts.

[0006] The need exists, therefore, to provide a method and system for unifying data from a plurality of heterogeneous databases, each having business-context related data and a data access mechanism.

SUMMARY OF THE INVENTION

[0007] In accordance with one embodiment of the present invention, a system for unifying data relating to an industry having a plurality of industry business context dimensions is provided. The system includes a plurality of data sources, each having data which is capable of grouping into at least one data source specific dimension, and at least one having a physical or logical structure differing from at least one other. The system further includes a database having a first and a second plurality of nodes, each of the first plurality representing an industry business context dimension, and each of the second plurality representing a data source specific dimension. The system still further includes a plurality of data source query function calls, each call querying a single data source regarding a single data source specific dimension.

[0008] In accordance with another embodiment of the present invention, a system for managing data in a plurality of data sources is provided. The system includes a UniDimNet and a plurality of UniViews. The UniDimNet includes a plurality of UniDims representing industry business context dimensions and a plurality of DataSourceDims representing data source specific dimensions of each data source. Each UniDim is related to at least one other UniDim, and each DataSourceDim is related to at least one UniDim. UniViews may be combined into complex queries. A complex query may have a set of input parameters, which do not include identification of a particular data source.

[0009] In accordance with yet another embodiment of the present invention, a method for managing data in a plurality of data sources is provided. The method includes the steps of identifying a plurality of industry business context dimensions, identifying at least one data source specific dimension for each data source, providing a UniDimNet, providing a plurality of UniViews, formulating a complex query and providing the results of the query.

[0010] In accordance with still another embodiment of the present invention, a method for querying data in a plurality of data sources is provided. The method includes the steps of receiving a dimension to be queried, providing a plurality of data source query function calls to select from, creating a result set including columns defined by the selected function calls, receiving the identity of at least one dimension instance to query, and populating the columns with the results of the query.

[0011] An advantage of the present invention is that a plurality of data sources containing heterogeneous data may be unified and queried. A further advantage of the present invention is that a plurality of data sources having differing logical or physical structures can be queried by a single system using the data access mechanisms which are provided for each data source. Still a further advantage of the present invention is that complex queries for the data sources can be created, modified and carried out across the complete set of data sources. An additional advantage is that related data within multiple data sources can be queried without having to identify each particular data source in the query.

[0012] These and other aspects and advantages of the present invention will be apparent to those skilled in the art from the following description of the preferred embodiments in view of the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

[0013] In the accompanying drawings which are incorporated in and constitute a part of the specification, embodiments of the invention are illustrated, which, together with a general description of the invention given above, and the detailed description given below, serve to example the principles of this invention.

[0014] FIG. 1 is an exemplary overall system diagram of a system for unifying data in accordance with one embodiment of the present invention;

[0015] FIG. 2 is an exemplary diagram of a database in accordance with one embodiment of the present invention;

[0016] FIG. 3 is an exemplary diagram of a data structure in accordance with one embodiment of the present invention;

[0017] FIG. 4 is an exemplary diagram of a dimensional data structure in accordance with one embodiment of the present invention;

[0018] FIGS. 5A-5C are exemplary diagrams of data source data structures in accordance with one embodiment of the present invention;

[0019] FIGS. 6A-6C are exemplary diagrams of data structure relationships in accordance with one embodiment of the present invention;

[0020] FIG. 7 is an exemplary diagram of data structure relationships among multiple data sources in accordance with one embodiment of the present invention;

[0021] FIG. 8 is an exemplary diagram of tables and table relationships in accordance with one embodiment of the present invention;

[0022] FIG. 9 is an exemplary diagram of a UniView relationship in accordance with one embodiment of the present invention;

[0023] FIG. 10 is an exemplary diagram of a UniViewInterface in accordance with one embodiment of the present invention;

[0024] FIG. 11 is an exemplary diagram of a CompoundUniView in accordance with one embodiment of the present invention;

[0025] FIG. 12 is an exemplary diagram of a UniBuilder in accordance with one embodiment of the present invention;

[0026] FIG. 13 is an exemplary diagram of a UniServer in accordance with one embodiment of the present invention;

[0027] FIG. 14 is an exemplary flowchart of a data retrieval method in accordance with one embodiment of the present invention;

[0028] FIG. 15 is an exemplary diagram of a notifier in accordance with one embodiment of the present invention;

[0029] FIGS. 16-19 are exemplary screen shots of a UniViewer interface in accordance with one embodiment of the present invention;

[0030] FIGS. 20A-20B are an exemplary flowchart of a method to add a data source in accordance with one embodiment of the present invention;

[0031] FIG. 21 is an exemplary flowchart of a method of unifying data in accordance with one embodiment of the present invention; and

[0032] FIG. 22 is an exemplary flowchart of steps of a method of unifying data in accordance with one embodiment of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

[0033] The following includes definitions of exemplary terms used throughout the disclosure. For illustrative purposes only, and not to limit the disclosure of the invention set forth herein, an exemplary industry and an exemplary group of databases will be used herein to illustrate examples of certain embodiments of the present invention. The exemplary industry is the pharmaceuticals industry, and particularly the pharmaceuticals industry as such relates to clinical trials and clinical trial evaluations of certain drugs on certain patients.

[0034] In the pharmaceuticals industry, a sponsor is an entity (e.g., a drug or medical device company) which is desirous of having a drug (or other medical device) tested for, inter alia, FDA approval. Such tests are conducted as one or more clinical trials, or studies. Each study typically incorporates one or more sites (e.g., a particular hospital or doctors' practice group) at which one or more patients use the drug on a trial basis. Records are kept of the patient and the patient's use of the drug, including any symptoms. Such records are oftentimes captured electronically via an electronic data capture (“EDC”) suite of software applications and stored in associated database(s). Although the present example is described in terms of clinical trials and the pharmaceuticals industry, those skilled in the art will readily appreciate that the invention will find application in any industry which uses multiple heterogeneous databases as set forth herein.

[0035] In the following definitions of exemplary terms, both singular and plural forms of all terms fall within each meaning. Except where noted otherwise, capitalized and non-capitalized forms of all terms fall within each meaning:

[0036] As used herein, “data” is used generically and includes but is not limited to information in a form suitable for processing by a computer. Except where noted otherwise, “data” is information (including operational and legacy) which is contained or capable of being contained in a data source (as defined below). In the pharmaceuticals example, “data” includes but is not limited to individual patient information such as height, weight, sex and age; study information such as particular EDC response(s); and study information such as the identity of the drug.

[0037] As used herein, “data source” is used generically and includes but is not limited to a database and/or software application which provides and/or stores data. In the pharmaceuticals example, a “data source” is a database which contains information relating to a sponsor, site, study, patient or any other entity related to the pharmaceuticals industry.

[0038] As used herein, “data source instance” is a particular installation of a data source. In the pharmaceuticals example, a “data source instance” is a real installation of a data source accessed by a system of the present invention. For example, a database containing information for a study which was not previously accessed by a system of the present invention would be considered a new “data source instance” to the system.

[0039] As used herein, “data access mechanism” is used interchangeably with “data retrieval mechanism” and is used generically, including but not limited to a software application or module of a software application which facilitates access to and retrieval of data from a data source. Typically a data access mechanism is specific to the database and/or software application (or database type or software application type) to which it is related, being customized to access same in response to a query. Exemplary data access mechanisms include, but are not limited to, pre-built database queries, pre-built database views and sets of application program interfaces (API's).

[0040] As used herein, a “dimension” is a specific logical concept within an industry. In the pharmaceuticals example, “dimensions” include but are not limited to “sponsor,” “study,” “site” and “patient,” each of which defines a specific logical concept within the pharmaceuticals industry. In this example, “dimensions” can be conceptualized as a set of interrelated entities (e.g., sponsor, study, site and patient) which correspond to specific interrelated industry concepts (e.g., the sponsors of clinical trials, studies for particular drug(s), site(s) at which the drug is tested, and patient(s) which participate in the study).

[0041] As used herein, “industry business context” is a set of dimensions which define the data pertinent to an industry. In the pharmaceuticals example, the “industry business context” is the set of dimensions which define all the data which is pertinent to the pharmaceuticals industry and which is to be accessible by a system or method of the present invention. Generally speaking, the complete set of data which is defined by the pharmaceuticals industry and which is accessible by a method or system of the present invention is grouped or categorized into logical concepts (dimensions), the complete set of which defines the industry business context of the pharmaceuticals industry.

[0042] As used herein, “industry business context dimension” is used interchangeably with “dimension.”

[0043] As used herein, “dimension instance” is a particular embodiment (or record) of a dimension. In the pharmaceuticals industry example, wherein “patient” is a dimension, a particular person who is a patient (e.g., Joe Smith) is a “dimension instance” of that dimension.

[0044] As used herein, “data source specific dimension” is a specific logical concept within a data source. In an embodiment, a “data source specific dimension” is a set or subset of data contained within a data source which is more abstract and meaningful in a business sense than the raw data stored in the data source. In the pharmaceuticals industry example, a particular database contains data from a particular study, including data relating to the sites of the study, the patients in the study and EDC responses from the study. In this example, “data source specific dimensions” can be conceptualized as a set of interrelated entities (e.g., “study,” “site,” “patient” and “EDC responses”) which correspond to the data in the data source and are defined for that particular data source. In an embodiment, a “data source specific dimension” is a set of conceptually-related data which can be retrieved from a data source or a plurality of data sources.

[0045] As used herein, “data source business context” is a set of data source specific dimensions which define the data in a particular data source. Generally speaking, the complete set of data which is contained in a data source is grouped or categorized into logical concepts (data source specific dimensions), the complete set of which defines the “data source business context” of the data source.

[0046] As used herein, “logic” is used generically and includes but is not limited to hardware, software and/or combinations of both to perform a function.

[0047] As used herein, “software” is used generically and includes but is not limited to one or more computer executable instructions, routines, algorithms, modules or programs including separate applications or code from dynamically linked libraries for performing functions as described herein. Software may also be implemented in various forms such as a servlet, applet, stand-alone, plug-in or other type of application. Software can be maintained on various computer readable mediums as known in the art.

[0048] As used herein, “network” is used generically and includes but is not limited to the Internet, intranets, Wide Area Networks, Local Area Networks and transducer links such as those using Modulator-Demodulators (modems).

[0049] In an embodiment, the present invention is directed to a system, method and database design for unifying data from a plurality of heterogeneous databases, each having business-context related data and a data access mechanism. A database is created (the UniDimNet) which contains a node for each dimension of an industry. For each data source which is accessible via the system, a set of data source specific dimensions is created and mapped to the corresponding industry business context dimension(s). A set of templates (UniViews) is created to query the data sources. Each UniView contains a specific question for a specific dimension designed for a specific data source. UniViews query the database they are associated with by using the data access mechanism of the associated database. A central server (the UniServer) coordinates the system and facilitates use of the system through an interface (the UniViewer). The UniViewer allows a user to query the data sources by identifying an industry business context dimension, a dimension instance and at least one UniView. Multiple UniViews can be combined, cached and saved to facilitate complex queries. Although the present invention is described with reference to an exemplary set of databases relating to clinical trials for the pharmaceutical industry, those skilled in the art will readily appreciate that the invention will find application in any type of database management setting involving the management of a plurality of heterogeneous databases, for example, in the management of heterogeneous databases involved with payroll or corporate human resources applications.
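For illustration only, the query flow described above (identify a dimension, select UniViews for it, name a dimension instance, and populate a result set whose columns are defined by the selected UniViews) might be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the class layout, the function names `available_univiews` and `run_query`, the in-memory dictionaries standing in for each data source's data access mechanism, and every data value are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class UniView:
    """One specific question, for one dimension, against one data source."""
    title: str                     # column heading in the result set
    dimension: str                 # industry business context dimension
    data_source: str               # data source the question is designed for
    ask: Callable[[str], object]   # stand-in for the data access mechanism


# Hypothetical stand-ins for two data sources; all values are invented.
_PD_AGES = {"Joe Smith": 42}
_EDC_VISIT_COUNTS = {"Joe Smith": 3}

UNIVIEWS = [
    UniView("Age", "patient", "PD", _PD_AGES.get),
    UniView("Visit count", "patient", "EDC", _EDC_VISIT_COUNTS.get),
    UniView("Site name", "site", "CA", lambda instance: None),
]


def available_univiews(dimension):
    """Offer only the UniViews defined for the requested dimension."""
    return [v for v in UNIVIEWS if v.dimension == dimension]


def run_query(dimension, instances, selected):
    """Build a result set: one column per selected UniView, one row per instance."""
    columns = [dimension] + [v.title for v in selected]
    rows = [[inst] + [v.ask(inst) for v in selected] for inst in instances]
    return columns, rows


choices = available_univiews("patient")
columns, rows = run_query("patient", ["Joe Smith"], choices)
```

Note that the caller names only a dimension and a dimension instance; which data source answers each column is carried by the UniView itself.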

[0050] With reference to FIG. 1, an overview of a system for unifying data 100 of the present invention is shown. In this embodiment, system 100 includes UniBase 110, UniServer 120 and at least one data source 130, and may further include any or all of Notifier 140, UniBuilder 150 and UniViewer 160. System 100 exists on any suitable computer, computer system or related group of computer systems known in the art. In an embodiment, UniBase 110 and UniServer 120 exist on a central server. Notifier 140, UniBuilder 150 and UniViewer 160 optionally also exist on the central server. Data sources 130 are optionally located on the central server or on a remote computer or a remote computer system (not shown). In an embodiment in which any element disclosed herein is located on a computer or computer system remote from other elements of the system, an appropriate electronic connection, including but not limited to a network connection, is established between the remote elements to facilitate communication therebetween. Any appropriate network or other communication method may be used. System 100 is embodied in any suitable programming language or combination of programming languages, including database managers and SQL.

[0051] With reference to FIG. 2, system 100 of the present invention includes UniBase 110. UniBase 110 is a database which stores and facilitates access to UniDimNet 210 and, optionally, UniView table 220 and cached UniView results 230. UniBase 110 is any suitable database for storing data and is embodied in any suitable database program, including but not limited to database software offered by Oracle.

[0052] With reference to FIG. 3, UniDimNet 210 is a database which contains a representation of all industry business context dimensions which are relevant to system 100 and their interconnections. UniDimNet 210 further contains a representation of all data source dimensions for all data sources accessible by system 100 and their interconnections. UniDimNet 210 still further contains connections (or relations) between represented industry business context dimensions and represented data source dimensions.

[0053] Each industry business context dimension which is used in system 100 is represented by a UniDim, the complete set of which for system 100 is represented at 310. A UniDim is an entry and description in the UniDimNet 210 database which represents and defines a unique industry business context dimension within system 100.

[0054] The complete set of UniDims 310 can be represented in UniDimNet 210 by any suitable mechanism. In an embodiment, each UniDim 380 is a node in a network which defines UniDimNet 210 and which acts as a single point of reference for all information relating to the specific industry business context dimension represented by the UniDim. The complete set of dimensions for an industry of system 100 is defined by any suitable mechanism. In an embodiment, the complete set of dimensions is defined by analyzing the industry and the industry's use of data sources to determine which business concepts can be naturally grouped together or most advantageously grouped together by relevance. In another embodiment, the complete set of dimensions is defined by analyzing the data of a given industry to determine which concepts in the data are most frequently referenced, cited and/or queried. In still another embodiment, the complete set of dimensions is defined by reviewing all data sources accessible to a system 100 and determining logical groupings of the data based upon the business context of the industry, irrespective of the physical and/or logical groupings from the native data sources.

[0055] With reference to FIG. 4, an exemplary complete set of dimensions for the exemplary pharmaceuticals industry is illustrated. In this example, complete set of UniDims 310 contains a plurality of UniDims 410 through 470 which define the pharmaceuticals industry business context. While only seven UniDims are represented in this example, it will be appreciated that any number of suitable UniDims may be defined, and it will be further appreciated that many systems of the present invention may include a significantly greater number of UniDims as required to define an industry's business context. In this example, UniDim 410 represents the dimension “sponsor;” UniDim 420 represents the dimension “study;” UniDim 430 represents the dimension “site;” UniDim 440 represents the dimension “patient;” UniDim 450 represents the dimension “patient in site;” UniDim 460 represents the dimension “visit;” and UniDim 470 represents the dimension “symptom entry.”

[0056] Each UniDim is related to at least one other UniDim. As illustrated in FIG. 4, the UniDims of this pharmaceutical example are all roughly hierarchically related. UniDim 410 (sponsor) may relate to one to many studies (represented by UniDim 420; multiple UniDims 420 not shown), each study may relate to one to many sites (represented by UniDim 430; multiple UniDims 430 not shown) and each site may relate to one to many patients (represented by UniDim 450; multiple UniDims 450 not shown). Each UniDim 450 (i.e., each patient in the site) may relate to one to many “patient” UniDims 440 (representing information regarding the patient, such as height, weight, etc.), one to many “symptom entry” UniDims 470 (representing patient symptom entries) and one to many “visit” UniDims 460 (representing patient visits). As graphically illustrated in FIG. 4, certain UniDims are deemed to be “higher” in the hierarchy than other UniDims. For example, UniDim 410 is “higher” in the network of UniDims than UniDim 430. In an embodiment, each UniDim in the network is viewed as one dimensional, while information regarding the hierarchical relationships between the nodes extends the network into a second dimension.
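By way of a hedged illustration, the roughly hierarchical UniDim network of FIG. 4 could be modeled as nodes with one-to-many links, with “higher” meaning that one node is reachable from another by following those links downward. The `UniDim` class, its `relate` and `is_higher_than` methods and the in-memory representation are assumptions made for this sketch; the disclosure leaves the representation to any suitable mechanism.

```python
from dataclasses import dataclass, field


@dataclass
class UniDim:
    """A node for one industry business context dimension (hypothetical layout)."""
    ref: int     # reference numeral used in FIG. 4
    name: str
    children: list = field(default_factory=list)

    def relate(self, child):
        """Record a one-to-many relation from this UniDim to a lower one."""
        self.children.append(child)
        return child

    def is_higher_than(self, other):
        """True if `other` is reachable by following relations downward."""
        return any(c is other or c.is_higher_than(other) for c in self.children)


# Relations as described for the pharmaceuticals example.
sponsor = UniDim(410, "sponsor")
study = sponsor.relate(UniDim(420, "study"))
site = study.relate(UniDim(430, "site"))
patient_in_site = site.relate(UniDim(450, "patient in site"))
patient = patient_in_site.relate(UniDim(440, "patient"))
symptom_entry = patient_in_site.relate(UniDim(470, "symptom entry"))
visit = patient_in_site.relate(UniDim(460, "visit"))
```

With these relations, `sponsor.is_higher_than(site)` holds while the reverse does not, matching the statement that UniDim 410 is “higher” than UniDim 430.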

[0057] With reference again to the UniDimNet 210 in FIG. 3, each UniDim 380 in the complete set of UniDims 310 is mapped, or related, to at least one data source specific dimension 390 contained in a complete set of data source specific dimensions (e.g., 320, 330 and 340) corresponding to a specific data source (e.g., with additional reference to FIG. 1, data sources 172, 174 and 176) accessible to system 100. As illustrated in FIG. 1, system 100 accesses a plurality of data sources 130, as exemplified by data sources 172, 174 and 176. While three data sources are illustrated, it will be appreciated that any number of data sources may be accessed by a system 100 of the present invention.

[0058] With reference to FIGS. 1 and 3, data sources 172, 174 and 176 may be any suitable data source containing information relevant to the industry and accessed and/or accessible to system 100. With reference to FIG. 3, for each data source accessible by system 100 a complete set of data source specific dimensions (e.g., complete set 320 for data source 172) is created. The complete set of data source specific dimensions for any data source may be determined by any suitable mechanism. In an embodiment, the internal data structure(s) of the data source and the business information represented by the data are analyzed to determine the logical “groups” or associations of data which define each individual data source specific dimension. In another embodiment, the business context of each set of information within a data source is analyzed to further define the data source specific dimensions for the data source. In still another embodiment, the industry business context dimensions contained in system 100 are consulted to determine whether the data in the data source can be conceptually grouped consistently with the industry business context. In still another embodiment (such as with a relational database, as illustrated below), the internal structure of the data source defines each data source specific dimension for a data source.

[0059] Exemplary data sources and corresponding data source specific dimensions are illustrated in FIGS. 5A, 5B and 5C. In this exemplary embodiment, data source 172 (FIG. 5A) comprises a data source such as DATATRAK Central Administration (“DATATRAK CA”), an exemplary database provided by DATATRAK International, Inc. In the DATATRAK CA, original data is either stored in a dedicated table, a set of tables or in a generic table. The complete set of data source specific dimensions depends upon the nature of the information stored in these tables and the type of information accessed from system 100. In an embodiment wherein information to be accessed by system 100 is contained in dedicated tables within DATATRAK CA, pre-defined classes relating to the dedicated tables define the data source specific dimensions. The classes, as with the corresponding data source specific dimensions, can span across a set of tables to select data from each to facilitate business-context grouping, thereby allowing decoupling of the original data (and original data tables) from the business context. In an embodiment wherein information to be accessed by system 100 is contained in generic tables within DATATRAK CA, data source specific dimensions are derived from the content of specific fields within the original data. Specific fields are grouped based upon the business-context of the fields. Once the complete set of data source specific dimensions is defined for data source 172 and the specific data source specific dimensions thereof created, the complete set 320 is stored in UniDimNet 210.

[0060] With reference to FIG. 5A, exemplary data source specific dimensions for exemplary data source 172 (DATATRAK CA) include “CA Sponsor,” “CA Study” and “CA Site.” These data source specific dimensions relate to conceptual groupings of data within DATATRAK CA which relate to, respectively, sponsors in the data source, studies in the data source and sites in the data source. Each data source specific dimension which is used in system 100 is represented in the UniDimNet 210 as a DataSourceDim, the complete set of which for data source 172 is represented at 320. A DataSourceDim is an entry in the UniDimNet 210 database which represents a unique data source specific dimension within system 100. Data source specific dimension “CA Sponsor” is represented by DataSourceDim 510, data source specific dimension “CA Study” is represented by DataSourceDim 512 and data source specific dimension “CA Site” is represented by DataSourceDim 514. An additional DataSourceDim, “Central Admin” 516, is defined for DATATRAK CA as a special DataSourceDim which refers to the data source itself (DATATRAK CA). Each DataSourceDim in the complete set for a particular data source is linked to the special DataSourceDim for that data source.
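As a rough illustration of the complete set 320 just described, each DataSourceDim could carry a link to the special DataSourceDim that stands for its data source. The `DataSourceDim` class and the `build_complete_set` helper are hypothetical names chosen for this sketch, and using an empty link to mark the special entry is an assumption rather than anything the disclosure specifies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataSourceDim:
    """An entry for one data source specific dimension (hypothetical layout)."""
    ref: int                                   # reference numeral from FIG. 5A
    name: str
    source: Optional["DataSourceDim"] = None   # None marks the special entry


def build_complete_set(special_ref, special_name, dims):
    """Create the special DataSourceDim and link every other entry to it."""
    special = DataSourceDim(special_ref, special_name)
    return [special] + [DataSourceDim(ref, name, special) for ref, name in dims]


# Complete set 320 for data source 172 (DATATRAK CA), as named in the text.
ca_set = build_complete_set(
    516, "Central Admin",
    [(510, "CA Sponsor"), (512, "CA Study"), (514, "CA Site")],
)
central_admin = ca_set[0]
```

Following `source` from any ordinary entry leads to “Central Admin” 516, which is one way to recover the owning data source for a given DataSourceDim.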

[0061] Further in this exemplary embodiment, data source 174 comprises a data source which is organized to facilitate generic SQL data queries. In such a SQL database (generally speaking, a relational database), the original data of the database is stored in tables which directly represent a particular business-context grouping. In this regard, the data source specific dimensions for such a database may be derived directly from the tables in which the data is stored. For example, the database may contain a table for “patient” which contains patient information. Such a table may be consistent with a data source specific dimension for “patient.” Alternatively, data source specific dimensions may be defined which span multiple tables, as long as the data selected from each table is related in a business context. Once the complete set of data source specific dimensions is defined for data source 174 and the specific data source specific dimensions thereof created, the complete set 330 is stored in UniDimNet 210 as representative DataSourceDims. With reference to FIG. 5B, exemplary DataSourceDims 520, 522 and 524 represent data source specific dimensions “PD Patient,” “PD Patient in Site” and “PD Symptom Entry” respectively. As with complete set 320 for data source 172, special DataSourceDim 526 referring to data source 174 is created in complete set 330 for data source 174.
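The table-driven derivation described for a relational data source can be sketched with an in-memory SQLite database standing in for data source 174. The table names, column names and the `derive_data_source_dims` helper are hypothetical, the “PD” prefix simply follows the labels used above, and reading the table list from SQLite's `sqlite_master` catalog is one assumed way of inspecting the internal structure.

```python
import sqlite3


def derive_data_source_dims(conn, prefix):
    """Map each table name to a candidate data source specific dimension label."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return {name: f"{prefix} {name.replace('_', ' ').title()}" for (name,) in rows}


# Hypothetical schema; a real data source would already contain its tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (id INTEGER PRIMARY KEY, height REAL, weight REAL)")
conn.execute(
    "CREATE TABLE symptom_entry (id INTEGER PRIMARY KEY, patient_id INTEGER, note TEXT)"
)

pd_dims = derive_data_source_dims(conn, "PD")
```

A one-table-per-dimension mapping is only the simplest case; a dimension spanning several tables would need an explicit definition instead of this automatic pass.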

[0062] Still further to this exemplary embodiment, data source 176 comprises a data source which is fully generic and not designed to include any specific tables for any specific concepts, such as exemplary DATATRAK EDC Database (DATATRAK EDC) provided by DATATRAK International, Inc. Such a database is accessed by an API (such as DATATRAK QUESTIONVIEW®) and does not contain any internal structure which would aid in creating data source specific dimensions. In such a case, data source specific dimensions are created by any suitable mechanism. In an embodiment, data source specific dimensions for such a database are created by accessing metadata regarding the data stored in the database and by analyzing the data contained in the database in light of the industry business context. Once the complete set of data source specific dimensions is defined for data source 176 and the specific data source specific dimensions thereof created, the complete set 340 is stored in UniDimNet 210 as representative DataSourceDims. With reference to FIG. 5C, exemplary DataSourceDims 530, 532 and 534 represent data source specific dimensions “EDC Site,” “EDC Patient in Site” and “EDC Visit” respectively. As with complete set 320 for data source 172, special DataSourceDim 536 referring to data source 176 is created in complete set 340 for data source 176.

[0063] With reference again to FIG. 3, once the complete sets of DataSourceDims (as illustrated by exemplary complete sets 320, 330 and 340) have been created and stored in the UniDimNet 210, each DataSourceDim therein is mapped, or linked, to a corresponding UniDim in the complete set of UniDims 310. With reference to FIG. 6A, mapping of DataSourceDim complete set 350 to a portion of the complete set of UniDims 310 is illustrated. DataSourceDim 510, which relates to the data source specific dimension “CA Sponsor,” which itself relates to CA Sponsor information contained within the data source DATATRAK CA, is related to UniDim 410 “sponsor,” which relates to the industry business context dimension “sponsor.” In this regard each DataSourceDim except the special DataSourceDim referring to the data source is related to a corresponding UniDim. A UniDim is “corresponding” if it relates to the same or similar dimension as the data source specific dimension. Similarly, DataSourceDim 512 “CA Study” is related to UniDim 420 “study” and DataSourceDim 514 “CA Site” is related to UniDim 430 “site.”

[0064] With reference to FIGS. 6B and 6C, similar mapping between DataSourceDim complete sets 360 and 370 and UniDim complete set 310 occurs, with each DataSourceDim therein linked to a corresponding UniDim. In FIG. 7 the mapped relationships illustrated in FIGS. 6A, 6B and 6C are combined. As illustrated therein, certain UniDims (e.g., Patient in Site 450) are mapped to more than one DataSourceDim. This occurs when more than one data source contains information relating to a particular industry business context dimension (as represented by a UniDim).
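By way of non-limiting illustration only, the mapping described above can be sketched as a simple lookup from DataSourceDim to UniDim, inverted to show which UniDims are linked to more than one DataSourceDim. The dimension names are taken from FIGS. 5A-5C; the link from “PD Patient” to “patient,” the dictionary layout and the function name `invert` are assumptions made for this sketch and are not drawn from the figures.

```python
from collections import defaultdict

# Illustrative DataSourceDim -> UniDim links. "PD Patient" -> "patient" is an
# assumed link; the special DataSourceDims are omitted because they are not
# mapped to any UniDim.
DATASOURCEDIM_TO_UNIDIM = {
    "CA Sponsor": "sponsor",
    "CA Study": "study",
    "CA Site": "site",
    "PD Patient": "patient",
    "PD Patient in Site": "patient in site",
    "EDC Site": "site",
    "EDC Patient in Site": "patient in site",
}


def invert(mapping):
    """Group DataSourceDims under the UniDim each one is linked to."""
    by_unidim = defaultdict(list)
    for data_source_dim, unidim in mapping.items():
        by_unidim[unidim].append(data_source_dim)
    return dict(by_unidim)


UNIDIM_TO_DATASOURCEDIMS = invert(DATASOURCEDIM_TO_UNIDIM)

# UniDims fed by more than one data source
SHARED_UNIDIMS = sorted(
    name for name, dims in UNIDIM_TO_DATASOURCEDIMS.items() if len(dims) > 1
)
```

In this sketch, `SHARED_UNIDIMS` lists the UniDims for which more than one data source contributes dimension instances.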

[0065] In an embodiment, each DataSourceDim is a node in the network which defines UniDimNet 210, similar to each node defined by each UniDim. By mapping each DataSourceDim node onto the two-dimensional UniDimNet network defined by the UniDims and their relationships with each other, the UniDimNet network is expanded into three dimensions. In this context the UniDimNet 210 is a three-dimensional network of nodes stored as related data in the UniBase 110.

[0066] In another embodiment, UniDimNet 210 is a series of interrelated tables, with each node in the UniDimNet 210 being represented by a table. As such, each UniDim and each DataSourceDim in the UniDimNet 210 is represented by a table. Each dimension instance in a data source accessible by system 100 is represented by a row in at least one UniDim of the UniDimNet 210 and at least one DataSourceDim of the UniDimNet 210. With reference to FIG. 8, for example, a dimension instance 800 in data source 176 (DATATRAK EDC) is a particular site in a study from which data is captured by DATATRAK EDC (i.e., dimension instance 800 is an instance of industry business context dimension 430 “site”). Dimension instance 800 is represented in table 860 for DataSourceDim 530 (“EDC Site”) by row 862 and in table 850 for UniDim 430 (“Site”) by row 852.

[0067] Each UniDim table contains an entry (e.g., a row in the table) for each dimension instance of the dimension represented by the UniDim which is contained in a data source which is accessible by system 100. Each entry (e.g., row 852) in the UniDim table contains a globally unique identification 853 which uniquely identifies the dimension instance represented by the entry. Each entry may also contain a creation timestamp 854 and an update timestamp 855, representing the time the dimension instance was entered into system 100 and the last time the dimension instance was updated, respectively. Each entry may further contain additional information (i.e., additional record fields) regarding each dimension instance in the table. Any suitable information may be added. For example, a field 856 identifying the DataSourceDim related to the dimension instance may be added. It will be appreciated that the amount of information contained in each UniDim entry depends upon the response time desired for system 100 and the data storage space available for the UniBase. In general, the more information that is contained in each UniDim entry, the fewer data source look-ups need to be performed by system 100 (because much of the information regarding the dimension instance will already be stored in the UniDim entry), thus generally speeding up performance of the system. However, such additional information uses additional storage space, which may add to the cost of the system and, depending upon the database management software used to maintain the UniDimNet, may slow the system down if the size of the UniDim tables becomes too large.

[0068] Similarly to the UniDim tables, each DataSourceDim table contains an entry for each dimension instance of the dimension represented by the DataSourceDim which is contained in the data source which is related to the DataSourceDim. Each entry (e.g., row 862) in the DataSourceDim table contains a reference 864 to the data source in which the dimension instance is contained, key information 863 (such as a data source unique identifier) required to retrieve the dimension instance (and the data associated therewith) from the data source, and the unique identification 853 for the dimension instance as contained in the UniDim entry relating to the dimension instance. Each entry may also contain a creation timestamp 854 and an update timestamp 855 similarly to the related UniDim entry. Also as with UniDim entries, DataSourceDim table entries may contain additional information relating to the particular dimension instance, and may further contain additional information regarding the data source and the UniDim to which the DataSourceDim is related. It will be appreciated that the same factors relating to speed and size which dictate the amount of information contained within a UniDim entry are also applicable to determining the amount of information contained within a DataSourceDim entry.

[0069] With reference again to dimension instance 800, an entry relating to dimension instance 800 is contained in DataSourceDim table 860 (entry 862) and in UniDim table 850 (entry 852). These entries are related to each other by any suitable mechanism. In one embodiment, the relation is contained in a DataSourceDim identifying field 856 of a UniDim record. In another embodiment, the relation is further defined by each entry containing the same UniDim unique identification 853.
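By way of non-limiting illustration only, the two kinds of table entries and the relation between them can be sketched as follows. The field comments refer to the reference numerals of FIG. 8; the class names, field names, the helper `register_instance`, the use of in-memory lists as tables and the key value `"site-0042"` are assumptions made for this sketch.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime


@dataclass
class UniDimEntry:
    unique_id: str          # globally unique identification (853)
    created: datetime       # creation timestamp (854)
    updated: datetime       # update timestamp (855)
    data_source_dim: str    # optional field identifying the DataSourceDim (856)


@dataclass
class DataSourceDimEntry:
    source_key: str         # key information needed by the data source (863)
    data_source: str        # reference to the data source (864)
    unique_id: str          # same identification (853) as the UniDim entry
    created: datetime
    updated: datetime


def register_instance(unidim_table, dsd_table, dsd_name, data_source, source_key):
    """Add one dimension instance to both tables, linked by a shared id."""
    now = datetime.now()
    uid = str(uuid.uuid4())
    unidim_table.append(UniDimEntry(uid, now, now, dsd_name))
    dsd_table.append(DataSourceDimEntry(source_key, data_source, uid, now, now))
    return uid


# Hypothetical usage: one site captured by DATATRAK EDC
site_table, edc_site_table = [], []
uid = register_instance(
    site_table, edc_site_table, "EDC Site", "DATATRAK EDC", "site-0042"
)
```

Both relation mechanisms described above appear in the sketch: the `data_source_dim` field of the UniDim entry, and the shared `unique_id` carried by both entries.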

[0070] A UniDim table may be related to more than one DataSourceDim table. With further reference to FIG. 8, UniDim table 850 is also related to DataSourceDim table 870 representing DataSourceDim 514 (“CA Site”), which is related to data source 172 (DATATRAK CA). For example, dimension instance 880 in data source 172 (DATATRAK CA) is a particular site in a study from which data is stored in DATATRAK CA (i.e., dimension instance 880 is an instance of industry business context dimension 430 “site”). Dimension instance 880 is represented in table 870 for DataSourceDim 514 (“CA Site”) by row 872 and in table 850 for UniDim 430 (“Site”) by row 858. It will be appreciated that in this example UniDim table 850 contains entries which relate to two different DataSourceDim tables (and consequently to two different data sources). UniDim table 850 can contain entries which relate to any number of DataSourceDim tables, as long as the dimension instance (to which the DataSourceDim relates) relates to the UniDim (to which the UniDim table relates). Furthermore, a UniDim table can contain multiple entries which relate to multiple entries in a single DataSourceDim table (i.e., representing multiple dimension instances in a single data source). In this event, an entry for each dimension instance will be stored in both the DataSourceDim and the UniDim tables relating to the dimension instance.

[0071] With still further reference to FIG. 8, it will be appreciated that each DataSourceDim table also may contain an entry (e.g., 890 for table 860 and 892 for table 870) referring to the data source itself. Such an entry relates to the special DataSourceDim (e.g., with reference to FIG. 5, DataSourceDim 516 (“DATATRAK CA”)) which contains information relating to a particular data source. Such an entry may contain information similar to other entries in the database, and optionally may contain additional information relating to a data source (e.g., information relating to the data source's access mechanism or, generally, connection information relating to the data source). Each data source which relates to a DataSourceDim has a special entry in the DataSourceDim which relates to itself. Each DataSourceDim node thus contains information for all real data sources of the corresponding type.

[0072] In an embodiment, each dimension instance contained within all data sources accessible to system 100 has a corresponding entry in at least one UniDim table and at least one DataSourceDim table. UniDimNet 210 thus facilitates querying of dimensions and dimension instances spanning all data sources irrespective of the physical and logical structure of each data source. Such queries (relating to dimensions or dimension instances, not to particular data sources or specific records in each data source) may be performed by templates, or UniViews.

[0073] A UniView is logic (e.g., a software component, routine or object) which performs an actual data request to a data source within system 100. A UniView is a specific question for a specific dimension designed for a specific data source. With reference to FIG. 9, exemplary UniView 900 communicates with data access mechanism 910 to facilitate access to and query of a data source (here, exemplary data source 172). In an embodiment, a UniView takes the form of a function call:

result=exact_request_for_information (instance_parameter)

[0074] wherein “result” is the requested information which is returned from the data source in response to the query, “exact_request_for_information” is the specific request (query) to the data source for specific information, and “instance_parameter” is the specific dimension instance the request regards. For example, a UniView querying for the height of patient “Joe Smith” would define “result” as being a field for containing the value of Joe Smith's height and could take the form of:

result=what is the height of the patient (patient=“Joe Smith”)

[0075] Upon a successful query to the appropriate data source, the exemplary UniView would return the value of Joe Smith's height as recorded in the data source.

[0076] In an embodiment, the “result” and the “specific request” of a UniView are created and stored while the “instance parameter” is left as a variable, thus allowing the UniView to be used and reused each time the same question for the same dimension for the same data source is made (a value for the “instance parameter” may be passed to the UniView in order to complete the UniView). In this manner, a single UniView may be selected and passed multiple instance parameters to effectuate multiple queries to the same data source for multiple dimension instances.
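By way of non-limiting illustration only, a stored UniView whose instance parameter is left as a variable can be sketched as a closure. The dictionary `FAKE_SOURCE` stands in for a data source reached through its access mechanism; it, the patient “Ann Lee,” the height values, the field name `height_cm` and the function `make_uniview` are all hypothetical.

```python
from typing import Any, Callable, Dict

# Hypothetical stand-in for a data source and its access mechanism.
FAKE_SOURCE: Dict[str, Dict[str, Any]] = {
    "Joe Smith": {"height_cm": 180},
    "Ann Lee": {"height_cm": 165},
}


def make_uniview(source: Dict[str, Dict[str, Any]], field: str) -> Callable[[str], Any]:
    """Fix the request and the data source; leave the instance open."""

    def uniview(instance_parameter: str) -> Any:
        # result = exact_request_for_information(instance_parameter)
        return source[instance_parameter][field]

    return uniview


# One stored UniView, reused for several dimension instances
height_of_patient = make_uniview(FAKE_SOURCE, "height_cm")
results = [height_of_patient(name) for name in ("Joe Smith", "Ann Lee")]
```

The request and the data source are bound once when the UniView is created; only the instance parameter changes between calls.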

[0077] Each UniView is created for a specific data source. In an embodiment, upon incorporating a data source into system 100, a plurality of UniViews are created in system 100 for querying the new data source. Each UniView contains the necessary information and instructions to facilitate access to a data source via the data access mechanism for that particular data source. For example, exemplary data sources DATATRAK EDC and DATATRAK CA are accessible via the DATATRAK QUESTIONVIEW API. A UniView for either of these data sources will be created with the ability to access and use the DATATRAK QUESTIONVIEW API for querying each database. The UniView will contain the required parameters, instructions and information necessary to instruct the API to query the databases and return certain results (in the format of “results”). In this sense the UniDimNet 210 is removed from particular details of the structure and physical requirements of each data source. The UniViews receive a dimension-specific query and facilitate access to a data source to respond to the query. While the above example has illustrated use of an API as a data access mechanism to a data source, it will be appreciated that any data access mechanism which is capable of querying a data source may be used.

[0078] It will be appreciated that the number and extent of UniViews which are created for any specific data source depends upon the number and type of dimension instances within the data source and a user's desire to query the data source. Any suitable number of UniViews for a particular data source may be created and used in system 100. It will be appreciated that to the extent a data source has voluminous data representing many instances, numerous UniViews will be created. UniViews may be stored in any appropriate element (or database) of system 100 such as, e.g., within UniBase 110. In an embodiment, UniViews (or “definitions” of UniViews) are stored in (with reference to FIG. 2) definition of UniViews database 220 of UniBase 110. In another embodiment, with reference to FIGS. 1 and 12, a plurality of UniViews is stored in UniView Tree 1210 of UniBuilder 160. UniView Tree 1210 may contain a listing of all UniViews created for system 100. The listing may be organized in any suitable manner for facilitating searching of and access to UniViews of the system. In an embodiment, the UniViews in UniView Tree 1210 are topically organized by dimension. In another embodiment, the UniViews are organized first by data source, then by dimension. In yet another embodiment, the UniViews are hierarchically organized. UniBuilder 160 may use manage UniView Tree logic 1220 to facilitate management of the UniView Tree. Manage UniView Tree logic 1220 includes any suitable steps, methods and/or processes for facilitating management of the UniView Tree 1210, including but not limited to logic for adding UniViews to the Tree, deleting UniViews from the Tree, reorganizing the Tree, searching the Tree, structuring the Tree and otherwise facilitating change to the Tree. As also further discussed below, UniView Tree 1210 may also contain UniViewInterfaces and CompoundUniViews. While the UniView Tree 1210 has been described as facilitating organization of the UniViews of a system 100, it will be appreciated that the UniViews themselves may be stored in the UniView Tree 1210 or alternatively may be stored in a different location (or database) with only identifying keys associated with each UniView being organized in the UniView Tree 1210.

[0079] System 100 optionally includes additional functions to assist in the management, use and creation of UniViews, including but not limited to using a UniViewInterface, using CompoundUniViews, UniView caching and UniView creation via an automated process such as, e.g., a UniBot. UniViewInterfaces are illustrated with reference to FIG. 10.

[0080] A UniViewInterface is a mechanism for combining multiple UniViews to facilitate queries which would require use of multiple UniViews. A UniViewInterface takes generally the same form as a UniView (i.e., a function call) wherein a plurality of UniViews are called by the single UniViewInterface. However, generally the “result” of a UniViewInterface is a result set, the contents of which is a collection (or array or table) of retrieved data which corresponds to a plurality of dimension instances retrieved by the plurality of UniViews which are called by the UniViewInterface. Furthermore, the “exact_request_for_information” in a UniViewInterface is not data source specific, as it is in a UniView. Instead, the “exact_request_for_information” relates only to the information requested (e.g., from a dimension), irrespective of the data source. Still furthermore, the “instance_parameter” is an array of dimension instances, rather than a single dimension instance. The array of dimension instances is defined by the set of dimension instances which are desired to be queried.

[0081] Use of a UniViewInterface is illustrated with reference to FIG. 10. In the example of FIG. 10, three separate data sources (Study DB 1040, EDC Study I DB 1042 and EDC Study II DB 1044), each with different physical and logical structures and employing different access mechanisms, each contain information relating to the dimension “patient characteristics” (such as, e.g., the age, height, weight, etc. of the participants in each study). A UniView may be used to query each database individually for a single dimension instance. When multiple dimension instances occur within each data source, multiple UniViews would need to be used to access each instance. The number of UniViews required to fully query all three data sources would be time consuming for a user to implement. A UniViewInterface 1010 facilitates multiple UniView querying with a single user query.

[0082] UniViewInterface 1010 is created with “result” as a collection for containing patient characteristic information for a plurality of patients. The “result” of a UniViewInterface 1010 query will be a collection of patient characteristics data, with each patient being an entry in the “result.” The “exact_request_for_information” is a query of the dimension “patient” 440 (with reference to FIG. 4) for patient characteristics (it does not specify a particular data source). The “instance parameters” is an array of patient names or other suitable patient identifiers. For a user who is desirous of receiving patient characteristics for patient 123 1000 (from Study), patient abc 1002 (from EDC I) and patient xyz 1004 (from EDC II), the user passes the identifiers (e.g., the names) of patient 123, patient abc and patient xyz to the UniViewInterface 1010 as values of the array “instance parameters” (i.e., the patients to be queried are passed to the UniViewInterface). Of note, the user does not need to pass the identity of the data source of each patient to the UniViewInterface. Upon receiving values in array “instance parameters,” the UniViewInterface 1010 queries the UniDim in the UniDimNet 210 which corresponds to the dimension “patient” (the UniViewInterface has been coded to search this dimension as the UniViewInterface 1010 is for “patient characteristics”) based upon or for the unique identification of each dimension instance in the array. Once the UniViewInterface 1010 obtains this information for each dimension instance, the UniViewInterface 1010 queries the DataSourceDim associated with each dimension instance (note: once the instance is found in the UniDim an association for the respective DataSourceDim already exists) to determine the proper data source associated with each dimension instance. With this information, the UniViewInterface 1010 calls multiple UniViews. Each called UniView is passed a single “instance_parameter” as a single instance from the array of “instance_parameter” of the UniViewInterface 1010, and the UniView to which the single instance is passed is selected by the nature of the UniViewInterface (i.e., it is a UniView for “patient characteristics,” exactly as the UniViewInterface) and the identity of the related data source. In the example, UniViewInterface 1010 thus calls UniViews 1020, 1022 and 1024 for, respectively, patient 123, patient abc and patient xyz. UniView 1020 accesses data source 1040 via access mechanism 1030, UniView 1022 accesses data source 1042 via access mechanism 1032 (e.g., the DATATRAK QUESTIONVIEW API) and UniView 1024 accesses data source 1044 via access mechanism 1034. The resulting collection of patient characteristic data is returned to the user in response to the UniViewInterface query. In an embodiment, authored UniViewInterfaces of system 100 are stored and/or organized in UniView Tree 1210. UniViewInterfaces are examples of complex queries.
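By way of non-limiting illustration only, the dispatch performed by a UniViewInterface can be sketched as follows. The lookup `PATIENT_UNIDIM` stands in for the UniDim and DataSourceDim queries that resolve each dimension instance to its data source, and `SOURCES` stands in for the three data sources of FIG. 10; both structures, the age values and all function names are assumptions made for this sketch.

```python
# Hypothetical resolution of each dimension instance to its data source,
# standing in for the UniDim and DataSourceDim look-ups.
PATIENT_UNIDIM = {
    "patient 123": "Study",
    "patient abc": "EDC I",
    "patient xyz": "EDC II",
}

# Hypothetical contents of the three data sources.
SOURCES = {
    "Study": {"patient 123": {"age": 54}},
    "EDC I": {"patient abc": {"age": 37}},
    "EDC II": {"patient xyz": {"age": 61}},
}


def make_uniview(source_name):
    """Build a 'patient characteristics' UniView for one data source."""

    def uniview(instance_parameter):
        return SOURCES[source_name][instance_parameter]

    return uniview


# One UniView per data source, all answering the same question
PATIENT_CHARACTERISTICS_UNIVIEWS = {name: make_uniview(name) for name in SOURCES}


def patient_characteristics(instance_parameters):
    """UniViewInterface: the caller passes instances only, never data sources."""
    result = {}
    for instance in instance_parameters:
        source_name = PATIENT_UNIDIM[instance]                   # resolve source
        uniview = PATIENT_CHARACTERISTICS_UNIVIEWS[source_name]  # select UniView
        result[instance] = uniview(instance)                     # single instance
    return result


result = patient_characteristics(["patient 123", "patient abc", "patient xyz"])
```

The caller supplies only the array of dimension instances; the selection of data source and UniView happens inside the interface.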

[0083] With reference to FIG. 11, a CompoundUniView is a mechanism for storing for future use a series (or combination) of UniViews. It may be desirable for a user to create a complex query in which multiple UniViews spanning multiple dimensions and multiple data sources are used. Once created, such a complex query (comprising multiple UniViews) can be stored as a CompoundUniView and subsequently used with different inputted instance parameters. In the example of FIG. 11, a user desires a “quality of life” query across all data sources for a particular patient. The user defines the “quality of life” query as including “patient diary gastric” and “patient diary pain level” from data source patient diary 1140, “EDC medication” from data source 1142 and “patient manager history” from data source 1144. The user thus combines four UniViews to create this complex query. The user selects UniViews 1120 and 1122 to use access mechanism 1130 to query data source 1140, UniView 1124 to use access mechanism 1132 to query data source 1142 and UniView 1126 to use access mechanism 1134 to query data source 1144. This complex query is saved as CompoundUniView “Quality of Life” 1110. When a subsequent user is desirous of querying system 100 for the “Quality of Life” of a particular patient, the user need only select CompoundUniView “Quality of Life” 1110 and pass to it the desired value of the instance_parameter (e.g., “patient 123” 1100). The CompoundUniView 1110 assembles UniViews 1120, 1122, 1124 and 1126, passes to each of them the instance_parameter, receives the “result” from each, and returns as a “result” the combined information returned by each UniView. In this sense, the only input parameters (i.e., data provided by a user to define a query or the scope of the query) required from a user for a CompoundUniView are the identity of the CompoundUniView (i.e., the description of the data to be queried, e.g., “weight of patient”) and the identity of the dimension instance(s) to be queried. A user may also input a parameter regarding the result desired (i.e., the format of the returned data). In an embodiment, authored CompoundUniViews of system 100 are stored and/or organized in UniView Tree 1210. CompoundUniViews are examples of complex queries.
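By way of non-limiting illustration only, a CompoundUniView can be sketched as a stored combination of UniViews that receives one instance parameter and returns the combined results. The four component UniViews below are placeholder functions that return descriptive strings rather than querying real data sources; they and the function `make_compound_uniview` are hypothetical.

```python
def make_compound_uniview(univiews):
    """Store a named combination of UniViews for later reuse."""

    def compound(instance_parameter):
        # Pass the same instance to every component and combine the results
        return {
            name: uniview(instance_parameter)
            for name, uniview in univiews.items()
        }

    return compound


# Placeholder UniViews; each would normally query its own data source
quality_of_life = make_compound_uniview(
    {
        "patient diary gastric": lambda p: f"gastric entries for {p}",
        "patient diary pain level": lambda p: f"pain levels for {p}",
        "EDC medication": lambda p: f"medication for {p}",
        "patient manager history": lambda p: f"history for {p}",
    }
)

result = quality_of_life("patient 123")
```

A later user needs only the stored compound and a dimension instance, e.g. `quality_of_life("patient abc")`.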

[0084] UniView caching is a mechanism for speeding up system 100 response time by caching the results of executed UniViews. In subsequent executions of such UniViews, the cached results are analyzed to determine if an update to the corresponding data source has occurred since the time of the caching. Generally speaking, if no update has occurred, the cached results can be returned for the execution of the UniView, thus saving the time and system resources required for accessing the data source directly in response to the UniView. If an update has occurred, the cache is ignored, the UniView queries the data source, and the response to the query is cached over the old cached data.

[0085] To facilitate such caching, in an embodiment, system 100 creates a cached results table for each UniView of system 100. The cached results tables may be stored in any suitable location within system 100. In an embodiment (with reference to FIG. 2), UniBase 110 has a cached UniView results database 230. Cached UniView results database 230 is any suitable database, with any suitable organization, for storing cached results from UniView queries. In an embodiment, cached UniView results database 230 contains a plurality of tables, each table being associated with a specific UniView of system 100. Each table contains the “result” data from the most recent query executed by the UniView associated with the table. In an embodiment, each caching instance in a table is appended with a time stamp which indicates the date and time the data was cached. An exemplary system use of UniView caching will be described further below. In an embodiment, creation of the cached results tables is facilitated by (with reference to FIG. 12) UniView table logic 1250 of UniBuilder 160 (discussed in more detail immediately below with regard to UniView creation). UniView table logic 1250 includes any suitable steps, processes, method and/or software code to facilitate creation of, access to and management of the cached results tables.

[0086] UniView creation can be afforded by any suitable mechanism, including manually (i.e., a single UniView is coded by a user). In an embodiment, system 100 provides tools to assist in the creation of UniViews. With reference to FIG. 12, UniBuilder 160 includes data source class logic 1230 and UniBot 1240.

[0087] In an embodiment, data source class logic 1230 assists in creation of UniViews. When a data source becomes accessible to system 100, a data source class is defined which sets forth the necessary information, steps, processes and access mechanisms for querying data source(s) of that class. The data source class contains information required by a UniView to create the “exact_request_for_information” element of the UniView (i.e., the specific information required by the UniView to facilitate a query to the data source via an appropriate data access mechanism). This information is formatted to facilitate use in a UniView directed to querying a data source of this class. In this manner, when it is desired to create a UniView which queries a data source of this class, the author of the UniView can “port” or “copy” the formatted data source class information into the UniView, thus saving time in re-creating the same code for each such UniView. To facilitate the creation of the data source class (with reference to FIG. 1), UniBuilder 160 receives information regarding data source(s) from data source 130. Data source class logic 1230 includes any suitable steps, methods, processes and/or code to facilitate creation of a data source class, formatting of a data source class, storage of a data source class, access to a data source class, and porting of a data source class to a UniView. Data source class information may be stored in any suitable location of system 100.

[0088] In an embodiment, UniBot 1240 facilitates generation of a set of “standard” UniViews for a data source. UniBot 1240 includes any suitable steps, methods, processes and/or code to facilitate such generation. UniBot 1240 may optionally be automated. Based upon user input regarding what a “standard” set of UniViews includes (e.g., how many UniViews for a data source of such a class; which dimensions are to be queried; what data is routinely queried from data sources of such a class; etc.), UniBot 1240 accesses the relevant data source(s) and determines the information necessary to create a standard set of UniViews. In an embodiment, UniBot 1240 accesses the relevant data sources and determines the information necessary to create the DataSourceDims for the data source for incorporation into a UniDimNet. In an embodiment, UniBot 1240 retrieves such information and creates a set of UniViews according to user-defined specifications.

[0089] With reference to FIG. 1, coordination among and between UniBase 110, UniBuilder 160, data sources 130 and other system 100 elements (discussed below) is facilitated by UniServer 120. With reference to FIG. 13, UniServer 120 optionally includes manage UniDimNet logic 1300, manage data source logic 1305, route connections logic 1310, compound UniView logic 1320, snapshot and versioning logic 1325 and logic for facilitating external data access 1330. It will be appreciated that UniServer 120 can optionally contain additional elements as necessary to facilitate operation of system 100.

[0090] In an embodiment, manage UniDimNet logic 1300 facilitates management of the UniDimNet. Manage UniDimNet logic 1300 includes any steps, processes, methods and/or software code to facilitate adding, updating and deleting UniDims and DataSourceDims from the UniDimNet, distributing UniDimNet structure to the UniBase, and other actions relating to the UniDimNet not otherwise facilitated by other elements of system 100. In an embodiment, manage data source logic 1305 facilitates management of the data sources accessible to system 100. Manage data source logic 1305 includes any steps, processes, methods and/or software code to facilitate management of the data sources and the relationships (including interactions) with the dimensions. In another embodiment, route connections logic 1310 manages connections between elements of system 100. Route connections logic 1310 includes any steps, processes, methods and/or software code to facilitate the routing of connections (including communications) between system 100 elements, including but not limited to the UniBase, the data sources and any user(s).

[0091] In a further embodiment, UniServer 120 optionally includes UniView query logic 1315 for coordinating system 100 actions and interaction during execution of a UniView. UniView query logic 1315 includes any steps, processes, methods and/or software code to facilitate coordination of system 100 resources during execution of a UniView. In an embodiment, UniView query logic 1315 is configured to facilitate one or more than one of the following UniView execution configurations: (1) ShowSourceBasedDataOnly. Under this configuration (exemplified in FIG. 14), a UniView will first check its associated table to determine if data is already cached in the table (step 1400). If no data is cached, the UniView queries the appropriate data source and returns a result therefrom (step 1410). The resulting data is cached (step 1420) and a timestamp for the cache is set (step 1430). If data has been cached, the time stamp of the cached data is compared to the time stamp of the corresponding DataSourceDim (step 1440). If the DataSourceDim time stamp is younger than the cached time stamp, the cache is ignored (step 1450) and the UniView queries the data source (step 1410). In the alternative, the cached data is retrieved (step 1460) and a query to the data source is not necessary. (2) ReceiveAlwaysFromDataSource. Under this configuration, a UniView will always query a data source for a result, and does not check its associated table for cached data. (3) ShowUniBaseDataOnly. This is the opposite configuration from ReceiveAlwaysFromDataSource. Under this configuration, the UniView will always use cached data in the table, even if it is out of date. Under this configuration the UniView does not query the data source. (4) ShowUnstableData. Under this configuration, the UniView will first check its associated table for cached data. If cached data exists, it will be returned as a result even if out of date. The UniView will continue processing in the background, similarly to the process set forth for ShowSourceBasedDataOnly, and will revise the returned result with data from the data source if the cached data is out of date. If, upon initial checking of the table, no cached data exists therein, the UniView will continue its background processing (i.e., it will query the data source). (5) DefaultBehavior. Under this configuration, the UniView itself contains code designating how it should process its query. In this instance, UniView query logic 1315 follows the steps contained in the UniView. While certain alternative configurations for UniView query logic 1315 have been set forth herein, it will be appreciated that any suitable configuration for UniView query logic 1315 may be used.
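By way of non-limiting illustration only, the first three execution configurations can be sketched as follows; ShowUnstableData and DefaultBehavior are omitted for brevity. The cache is modeled as a dictionary from dimension instance to a (result, cached time stamp) pair, and time stamps are plain numbers. These representations, the names `Mode` and `execute`, caching the result under ReceiveAlwaysFromDataSource, and returning `None` when ShowUniBaseDataOnly finds no cached data are all assumptions made for this sketch.

```python
from enum import Enum


class Mode(Enum):
    SHOW_SOURCE_BASED_DATA_ONLY = 1
    RECEIVE_ALWAYS_FROM_DATA_SOURCE = 2
    SHOW_UNIBASE_DATA_ONLY = 3


def execute(uniview, instance, cache, source_updated_at, now, mode):
    """Run a UniView under one configuration.

    cache maps instance -> (result, cached_at); source_updated_at plays the
    role of the DataSourceDim time stamp.
    """
    entry = cache.get(instance)                       # step 1400
    if mode is Mode.SHOW_UNIBASE_DATA_ONLY:
        # Never query the data source, even if the cache is out of date
        return entry[0] if entry else None            # None is an assumption
    # The cache is usable unless the source was updated after it was written
    fresh = entry is not None and source_updated_at <= entry[1]   # step 1440
    if mode is Mode.SHOW_SOURCE_BASED_DATA_ONLY and fresh:
        return entry[0]                               # step 1460
    result = uniview(instance)                        # steps 1450 and 1410
    cache[instance] = (result, now)                   # steps 1420 and 1430
    return result


# Hypothetical UniView that records each real data source query
calls = []


def height(instance):
    calls.append(instance)
    return 180


cache = {}
mode = Mode.SHOW_SOURCE_BASED_DATA_ONLY
first = execute(height, "Joe Smith", cache, source_updated_at=5, now=10, mode=mode)
second = execute(height, "Joe Smith", cache, source_updated_at=5, now=11, mode=mode)
third = execute(height, "Joe Smith", cache, source_updated_at=12, now=13, mode=mode)
```

In the usage above, the first call finds no cache and queries the data source, the second is served from the cache, and the third queries again because the source time stamp is more recent than the cached time stamp.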

[0092] In an embodiment, compound UniView logic 1320 facilitates processing of a CompoundUniView. Compound UniView logic 1320 includes any steps, processes, methods and/or software code to facilitate processing (execution) of a CompoundUniView. Particularly, compound UniView logic 1320 manages execution of a CompoundUniView and further acts as a virtual data source therefor. For each UniView which is called by a CompoundUniView, compound UniView logic 1320 performs a table cache check and (depending upon the nature of the cached data, if any) a data source query similarly to the steps illustrated above for ShowSourceBasedDataOnly. While compound UniView logic 1320 has been described herein with relation to a ShowSourceBasedDataOnly configuration, it will be appreciated that compound UniView logic 1320 may be configured to follow any suitable configuration.

[0093] In an embodiment, snapshot and versioning logic 1325 facilitates retaining “snapshots” of UniView query results and further facilitates labeling such snapshots with version identifiers. Snapshot and versioning logic 1325 includes any steps, processes, methods and/or software code to facilitate creating snapshots and versions of UniView query results. When a UniView query result is returned, it is optionally stored in a table corresponding to the UniView in the UniBase. Under one optional configuration of the UniServer, this cached data is overwritten the next time the same UniView returns an updated result from a query. Snapshot and versioning logic 1325 optionally allows any cached data to remain in the table. In an embodiment, a particular cached result is stored as an entry (e.g., a row) in the cache table. Snapshot and versioning logic 1325 facilitates subsequent returned results being stored as additional row(s) in the cache table. Furthermore, such cached results can be labeled with versioning identifiers to facilitate version comparisons. In this regard, multiple “snapshots” (i.e., former returned results from earlier executions of the UniView) are retained in the cache table, and may be compared to each other (e.g., for version comparison).
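
The difference between overwriting the cache and retaining snapshots can be sketched as follows, purely as an illustration. The sketch assumes an in-memory SQLite table standing in for a UniView cache table in the UniBase; the table name `uniview_cache`, its columns, and the functions `open_cache`, `store_result` and `snapshots` are hypothetical, and an auto-incrementing integer is assumed as the version identifier.

```python
import sqlite3


def open_cache() -> sqlite3.Connection:
    """Create a hypothetical cache table for one UniView."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE uniview_cache ("
        " version INTEGER PRIMARY KEY AUTOINCREMENT,"  # assumed version identifier
        " instance_id TEXT NOT NULL,"
        " result TEXT NOT NULL)"
    )
    return conn


def store_result(conn: sqlite3.Connection, instance_id: str, result: str,
                 keep_snapshots: bool) -> int:
    """Store a returned result; overwrite earlier rows unless snapshots are kept."""
    if not keep_snapshots:
        # Overwrite configuration: discard earlier results for this instance.
        conn.execute("DELETE FROM uniview_cache WHERE instance_id = ?", (instance_id,))
    cur = conn.execute(
        "INSERT INTO uniview_cache (instance_id, result) VALUES (?, ?)",
        (instance_id, result),
    )
    conn.commit()
    return cur.lastrowid  # the version identifier of the new row


def snapshots(conn: sqlite3.Connection, instance_id: str) -> list:
    """Return all retained (version, result) rows, oldest first."""
    return conn.execute(
        "SELECT version, result FROM uniview_cache"
        " WHERE instance_id = ? ORDER BY version",
        (instance_id,),
    ).fetchall()
```

With `keep_snapshots=True`, each execution adds a row, so two versions can be compared by selecting both rows; with `keep_snapshots=False`, only the most recent result remains.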

[0094] In an embodiment, logic for facilitating external data access 1330 facilitates access to system 100 by an external application, such as an OLE database data provider or an ODBC source. Logic for facilitating external data access 1330 includes any steps, processes, methods and/or software code to facilitate such access by an external application. In this regard system 100 can be used to function as a direct data supplier for queries from third-party systems. In an embodiment, such systems need not be configured in any specific way (other than enabling an ODBC connection, for example) to be able to access the data sources of system 100 and the querying power of system 100.

[0095] The dimension instances in the data sources are rarely static. To incorporate changes made to a dimension instance in a data source, system 100 optionally includes (with reference to FIG. 1) notifier 140 which communicates with UniServer 120 regarding changes which occur to dimension instances in the data sources 130. With regard to FIG. 15, notifier 140 optionally includes a rule book 1500 and change workflow logic 1510. Rule book 1500 is a database which contains rules associated with each data source which is accessible by system 100. For each data source, the associated rules define the necessary steps to be taken by the system to incorporate a change in a dimension instance (e.g., an addition, a deletion or a modification) into the system (e.g., into the UniDimNet). The rule(s) for any data source are any suitable rules to facilitate integration of the dimension instance change into system 100. In an embodiment, the rules define the minimum required information which must be retrieved from the dimension instance and passed into the system 100 for integration. Generally speaking, such information is retrieved from data source 130 by notifier 140 and passed to UniServer 120 for assimilation into system 100. The rules for a data source may be arbitrarily complex, ranging from minimal (e.g., a select of the unique identification of the dimension instance) to involved (e.g., using a cascading set of queries to automatically fill a complete dimension hierarchy for the data source). Furthermore, the rules may designate that any time stamps in system 100 for the dimension instance be updated upon a modification.

[0096] Change workflow logic 1510 optionally facilitates triggering of notifier 140 and referral to rule book 1500. Change workflow logic 1510 includes any steps, processes, methods and/or software code to facilitate triggering of notifier 140 and referral to rule book 1500. Any change to a dimension instance can be considered a “triggering event” which triggers notifier 140, and, specifically, change workflow logic 1510. It will be appreciated that a triggering event may include but not be limited to creation of a dimension instance, deletion of a dimension instance or any change to a property of a dimension instance, including the value of any data therein. Upon occurrence of a triggering event, change workflow logic 1510 is triggered. Change workflow logic 1510 accesses rule book 1500 to determine what workflow steps are to be implemented in order to assimilate the modified dimension instance into system 100. Change workflow logic 1510 performs and/or facilitates all steps set forth in the rule book to incorporate the modified instance into system 100. While change workflow logic 1510 has been illustrated as an element of system 100 outside of the UniServer 120, it will be appreciated that change workflow logic 1510 (and all of notifier 140) can optionally be included in UniServer 120.
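
One possible shape for a rule book and its change workflow is sketched below for illustration only. It assumes that rules are modeled as plain functions keyed by data source name and that the system state is reduced to a set of instance identifications plus a time stamp map; `ChangeEvent`, `UniDimNetState`, `Notifier` and the two sample rules are hypothetical names, not elements of the described system.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class ChangeEvent:
    """A triggering event: a change to a dimension instance in a data source."""
    data_source: str
    instance_id: str
    kind: str  # "create", "delete" or "modify"
    occurred_at: float


@dataclass
class UniDimNetState:
    """Hypothetical, heavily reduced stand-in for the state held in the system."""
    instance_ids: Set[str] = field(default_factory=set)
    time_stamps: Dict[str, float] = field(default_factory=dict)


Rule = Callable[[ChangeEvent, UniDimNetState], None]


def sync_instance_id(event: ChangeEvent, state: UniDimNetState) -> None:
    # Minimal rule: keep the unique identification of the instance in sync.
    if event.kind == "delete":
        state.instance_ids.discard(event.instance_id)
    else:
        state.instance_ids.add(event.instance_id)


def update_time_stamp(event: ChangeEvent, state: UniDimNetState) -> None:
    # Rule designating that the time stamp for the instance be updated.
    state.time_stamps[event.instance_id] = event.occurred_at


@dataclass
class Notifier:
    rule_book: Dict[str, List[Rule]] = field(default_factory=dict)

    def on_change(self, event: ChangeEvent, state: UniDimNetState) -> int:
        """Look up the rules for the event's data source and apply them in order."""
        rules = self.rule_book.get(event.data_source, [])
        for rule in rules:
            rule(event, state)
        return len(rules)  # number of workflow steps performed
```

A more involved rule, such as the cascading set of queries mentioned above, would be one more function in the list for that data source.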

[0097] User access to and use of system 100 can be achieved by any suitable user interface. In an embodiment, with reference to FIG. 1, system 100 further includes UniViewer 150. UniViewer 150 facilitates user access to system 100. Particularly, UniViewer 150 allows users to formulate queries to system 100 by combining UniViews, dimensions and dimension instances to form simple or complex queries. Generally speaking, UniViewer 150 is a graphical environment wherein queries are constructed by dragging query components (dimension(s) and UniView(s)) onto a “result” area and results are viewed in the result area by selecting particular dimension instances.

[0098] With reference to FIGS. 16 through 19, an exemplary UniViewer 150 is illustrated. Upon launch of UniViewer 150 by a user, the user initially determines whether to work with an existing query or whether to begin a new query. If an existing query is selected, the query (and result(s), if any) is retrieved and displayed by the graphical user interface (GUI). If a new query is desired, the user selects at least one industry business context dimension with which to begin the query.

[0099] With reference to FIG. 16, an exemplary GUI is displayed. User-selected dimension “site” 1610 is displayed and all UniViews 1620 available for querying dimension “site” are displayed. In this example, UniViews 1620 “EDC Namespace,” “Study” and “CRFDefinition” are groups of UniViews (as designated by the “+” to the left of each identifier) while the remaining identifiers relate to individual UniViews. Upon the user selecting at least one dimension at the beginning of this user-session, UniViewer 150 displays at 1630 each dimension instance which is contained in the UniDim which relates to the selected dimension. Any suitable process for retrieving such instances may be used. In an embodiment, the UniServer accesses the unique identifications for each instance as listed in the UniDim for the selected dimension. Upon retrieving the unique identifications, the UniServer can retrieve the dimension instances and display them at 1630. A result area 1640 (currently empty) is also displayed.

[0100] With reference to FIG. 17, the user selects any number of UniViews with which to query the data source(s). In FIG. 17, the user has selected the UniViews “Description,” “AdminID” and “NaSpID,” which have been dragged onto results area 1640. Upon receiving each dragged UniView, results area 1640 creates a column for the anticipated “result” of the UniView. In FIG. 17, results area 1640 displays a description column 1710 for the results of the “Description” UniView, an AdminID column 1720 for the results of the “AdminID” UniView, and a “NaSpID” column 1730 for the results of the “NaSpID” UniView. Of note, dragging multiple UniViews onto the results area 1640 exemplifies a complex query.

[0101] Once the query has been defined by the user by selecting UniView(s), with reference to FIG. 18, the user may select one or more dimension instances 1630 to pass to the queries in the results area 1640 (of note, passing more than one dimension instance exemplifies a complex query). In FIG. 18, the user has selected the “Stadt Klinik Bonn” instance. Upon selection of this instance, the UniViewer 150 passes this instance to the UniServer. The UniServer facilitates execution of the query. The result of each UniView query, upon return of results, is displayed 1810 in results area 1640 (i.e., the result of the “Description” UniView is “Krhs.”, the result of the “AdminID” UniView is “11” and the result of the “NaSpID” UniView is “5”).

[0102] While the above example has been illustrated with UniViews which represent properties of the dimension selected (e.g., each of the three selected UniViews returned properties of the selected dimension, “site”), it will be appreciated that selectable UniViews (1620) may have multiple ways of being associated with the selected dimension. For example, with reference to FIG. 19, a study-specific UniView (as compared to a site-specific UniView) is selected. The selected study-specific UniView 1920 “Visit Date” is connected to a dimension lower in the UniDimNet hierarchy than “site.” As such, the selected UniView can be configured to retrieve its data in different ways (because in the hierarchical UniDimNet the “higher” UniDim “site” can be associated with one to many “lower” UniDims “visit date”). For example, the user can select to query visit dates for a particular visit (e.g., “visit 2”) or for multiple visits with the multiple visits shown in columns or rows in the results area 1640. In an embodiment, such different ways can be selected by the user by right-clicking on the selected UniView and selecting the desired “way” from a menu of possible “ways.”

[0103] With reference to FIG. 19, in this example the user has selected to query visit dates for a particular visit (“visit 2”). The result area 1640 is extended to include the column “visit_data visit 2” 1910 which contains the results returned by the UniView discussed above. It will be appreciated that additional UniViews may be added, subtracted or modified in the UniViewer in order to interactively query the data sources with different complex queries.

[0104] While UniViewer 150 has been described with reference to the GUI created by system 100, it will be appreciated that any appropriate steps, methods or logic may be implemented by system 100 to facilitate the GUI of UniViewer 150 and the results of any user query made therewith. In an embodiment, while system 100 is running, the UniDimNet is loaded into memory (e.g., RAM). When a user initially selects a reference (start) dimension for a new query, a “results set” object is created and attached to the selected dimension in the UniDimNet. The “results set” object may be any suitable object, including a collection, table or an array, and may be “zeroed out” (i.e., empty) upon creation. Upon dragging a UniView onto the results area, the results set object may be modified accordingly. For example, depending upon the dimension context of the UniView, the results set object may be repositioned within the UniDimNet. In this example, if the UniView returns “visit information” and is dragged onto a “patient” dimension (as selected by the user), the results set object will be moved to a “lower” level (assuming the hierarchy of the UniDimNet defines the UniDim “patient” as being higher than “visits,” which will occur if “patient” is defined as having one to many “visits”). As the UniView is dragged onto the results area, columns (e.g., 1710, 1720 and 1730 with reference to FIG. 17) are added to the results set object (exemplifying a complex query). The columns are defined by the returned “result” from each UniView dragged onto the results area. To provide the information returned from the UniView query (after a dimension instance is subsequently selected by the user) into the results set object (and thus displayed to the user by the GUI), the columns of the results set object are mapped directly to the columns of the table which contains the results of the UniView query. The resulting data in the results set object is displayed in the results area of the GUI.
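
The interplay of UniViews, columns and dimension instances in a results set object can be sketched as follows. This is an illustrative sketch only: the `ResultSet` class, the `make_uniview` helper and the `SITE_DATA` dictionary are hypothetical, each UniView is reduced to a function from a dimension instance to a value, and the sample values are those shown for FIG. 18. Repositioning of the results set object within the UniDimNet is not modeled.

```python
from typing import Callable, Dict, List

# A UniView is reduced here to a function from a dimension instance to a value.
UniView = Callable[[str], str]


class ResultSet:
    """Hypothetical results set object: one column per dragged UniView."""

    def __init__(self) -> None:
        self.columns: Dict[str, UniView] = {}   # empty ("zeroed out") on creation
        self.rows: List[Dict[str, str]] = []

    def add_uniview(self, name: str, uniview: UniView) -> None:
        # Dragging a UniView onto the results area adds a column.
        self.columns[name] = uniview

    def select_instance(self, instance: str) -> Dict[str, str]:
        # Selecting a dimension instance runs every UniView and fills one row.
        row = {"instance": instance}
        for name, uniview in self.columns.items():
            row[name] = uniview(instance)
        self.rows.append(row)
        return row


# Stand-in for the data sources, using the values shown for FIG. 18.
SITE_DATA = {
    "Stadt Klinik Bonn": {"Description": "Krhs.", "AdminID": "11", "NaSpID": "5"},
}


def make_uniview(prop: str) -> UniView:
    """Build a stub UniView that looks up one property of a site instance."""
    return lambda instance: SITE_DATA[instance][prop]
```

Adding the three UniViews and then selecting “Stadt Klinik Bonn” yields a single row whose columns hold the three returned values, mirroring the order of user actions in FIGS. 17 and 18.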

[0105] It may be desirable during use of system 100 to add an additional data source or data sources to system 100. In this event, system 100 is modified in any suitable manner to accommodate the addition of such new data source(s). In an embodiment, with reference to FIGS. 20A and 20B, an exemplary procedure for incorporating additional data source(s) is illustrated. At 2005, the UniServer is shut down. At 2010, a determination is made as to whether the addition of the new data source(s) necessitates the addition of any new industry business context dimension(s) to the UniDimNet. If additional dimensions are necessary or otherwise deemed desirable, the UniDimNet is extended at 2015 to include a new UniDim for each new dimension, and to revise any existing UniDims accordingly. After adding the new UniDim(s) (or if no new UniDim(s) are added), at 2020 a DataSourceDim is created for each data source dimension in the additional data source(s) and all new DataSourceDims are linked accordingly into the UniDimNet. At 2025 a new data source class is optionally created, particularly if assistance from the UniBuilder in creating UniViews (e.g., with UniBot) is desired. At 2030 a rule book and update workflow is created and stored in system 100. At 2035 it is optionally determined whether the new data source(s) require a new class of data access mechanisms. If so, a new UniView class is created to encapsulate the access mechanisms required to query the data source(s), particularly if assistance from the UniBuilder in creating UniViews is desired, and the data source class created in 2025 is modified in accordance with the new UniView class.

[0106] After creation of the new UniView class (or if such class was not created), at 2045 the UniServer is restarted. At 2050 UniViews for the new data source(s) are created by any suitable means, including manually or with assistance from the UniBuilder (particularly the UniBot if new UniView classes have been created). At 2055 the new UniViews are included in the UniView Tree. At 2060 the new data source instances are registered with system 100, including registration and storage of all required information regarding each instance in the UniDimNet. The Notifier will react to the new instances (not shown) to further assimilate the new instances into system 100.

[0107] It will be appreciated that security regarding access to and use of a system 100 can be effectuated by any suitable mechanism. In an embodiment, the complete set of UniViews available to a user is defined by the access rights of the user. Each user may have a strictly defined set of UniViews which that user can view and/or access. In this sense, access to data sources (and certain data therein) can be controlled (e.g., if a UniView to a particular piece of data does not exist, the data will not be returned to the user; furthermore, if no UniViews to a particular data source are accessible to a user, the user will not be able to access the data source). Furthermore, access to dimension instances may also be restricted via user-specific access rights (i.e., a user can be prohibited from receiving data regarding a particular instance if the user is not given access rights to that instance).
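
As one illustration of such a mechanism, access rights can be modeled as two sets per user, one of permitted UniViews and one of permitted dimension instances. The sketch below assumes this model; `AccessRights`, `visible_univiews` and `may_query` are hypothetical names, and the text leaves the actual mechanism open.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class AccessRights:
    """Hypothetical per-user rights: permitted UniViews and dimension instances."""
    univiews: FrozenSet[str]
    instances: FrozenSet[str]


def visible_univiews(all_univiews: Iterable[str], rights: AccessRights) -> List[str]:
    # Only UniViews the user has rights to are offered for selection.
    return [name for name in all_univiews if name in rights.univiews]


def may_query(rights: AccessRights, uniview: str, instance: str) -> bool:
    # A query is allowed only if both the UniView and the instance are permitted.
    return uniview in rights.univiews and instance in rights.instances
```

Because a data source is reachable only through its UniViews in this model, withholding every UniView for a data source withholds the data source itself.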

[0108] With reference to FIG. 21, an embodiment of a method for unifying data of the present invention is shown. In this embodiment, at step 2110 a plurality of industry business context dimensions is defined. Any appropriate dimensions may be defined for an industry, including but not limited to those previously described herein. At step 2120 a plurality of data source specific dimensions is defined for each data source which may be queried. Appropriate data source specific dimensions may be defined, including but not limited to those previously described herein. At step 2130 a database including representations of the defined dimensions and data source specific dimensions (e.g., a UniDimNet) is provided. Any appropriate database may be provided, including but not limited to a UniDimNet as previously described herein. At step 2140 a plurality of queries adapted for each data source (e.g., a UniView) is provided. Any appropriate queries may be provided, including but not limited to UniViews, UniViewInterfaces and CompoundUniViews as previously described herein. Each query is provided to access a data source using a data access mechanism which facilitates access to the data source. At step 2150 at least one of the data sources is queried by using at least one of the provided queries. For example, as previously described herein, a UniView is used to query a data source associated with the UniView. At step 2160 result(s) of the query are provided to a user. Any appropriate steps for providing such result(s) to the user may be used, including but not limited to use of a GUI provided by a UniViewer as previously described herein.

[0109] It will be appreciated that the method described above may include any additional appropriate steps and that each step described above may comprise additional substeps. For example, with reference to FIG. 22, step 2120 optionally includes steps 2210-2250. At step 2210, an object in the database is created to represent each industry business context dimension. Any appropriate object may be used, including a UniDim in a table format as previously described herein. At step 2220, each object is related to at least one other object. Objects may be related in any suitable manner, including a hierarchical relationship of UniDims as previously described herein. At step 2230, an object in the database is created to represent each data source specific dimension. Any appropriate object may be used, including a DataSourceDim in a table format as previously described herein. At step 2240, each data source specific dimension object is related to at least one industry business context dimension object. Data source specific dimension objects may be related to industry business context dimension objects in any suitable manner, including as previously described herein. At step 2250, the objects in the database are populated with relevant information. Any relevant information may be stored in the objects in any suitable manner. For example, as previously described herein, the UniDims and the DataSourceDims are populated with unique identifications for each dimension instance.
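
Steps 2210 through 2250 can be sketched with in-memory objects as follows. This is an illustration under stated assumptions rather than the database design itself: the classes `UniDim`, `DataSourceDim` and `UniDimNet` below are simplified stand-ins that borrow the document's terms, the method names are hypothetical, and a parent-to-child list is assumed as the hierarchical relationship.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set, Tuple


@dataclass
class UniDim:
    name: str
    children: List[str] = field(default_factory=list)   # "lower" UniDims
    instance_ids: Set[str] = field(default_factory=set)


@dataclass
class DataSourceDim:
    data_source: str
    name: str
    unidim: str                                          # related UniDim
    instance_ids: Set[str] = field(default_factory=set)


@dataclass
class UniDimNet:
    unidims: Dict[str, UniDim] = field(default_factory=dict)
    source_dims: Dict[Tuple[str, str], DataSourceDim] = field(default_factory=dict)

    def add_unidim(self, name: str) -> None:
        # Step 2210: one object per industry business context dimension.
        self.unidims[name] = UniDim(name)

    def relate(self, higher: str, lower: str) -> None:
        # Step 2220: relate objects hierarchically (one to many).
        self.unidims[higher].children.append(lower)

    def add_source_dim(self, data_source: str, name: str, unidim: str) -> None:
        # Steps 2230 and 2240: create the data source specific dimension
        # object and relate it to an existing UniDim.
        if unidim not in self.unidims:
            raise KeyError(unidim)
        self.source_dims[(data_source, name)] = DataSourceDim(data_source, name, unidim)

    def populate(self, data_source: str, name: str, instance_ids: Iterable[str]) -> None:
        # Step 2250: store unique identifications of the dimension instances.
        dim = self.source_dims[(data_source, name)]
        dim.instance_ids.update(instance_ids)
        self.unidims[dim.unidim].instance_ids.update(instance_ids)
```

For example, a “patient” UniDim related to a lower “visit” UniDim, with a DataSourceDim from a hypothetical source linked to “patient”, would be built by calling the four methods in the order of the steps.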

[0110] Those skilled in the art will appreciate that the invention may be realized without utilizing all the above-described steps of the exemplary embodiment, nor must the steps be carried out in the described order.

[0111] The invention has been described with reference to the preferred embodiments. Modifications and alterations will occur to others upon a reading and understanding of this specification. It is intended to include all such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

We claim:
 1. A system for unifying data relating to an industry having a plurality of industry business context dimensions which define logical groupings of data related to the industry, the system comprising: a plurality of data sources, at least one data source having a physical or logical structure differing from at least one other data source, each data source having data which is capable of a logical contextual grouping into at least one data source specific dimension which contains data related to at least one industry business context dimension, and each data source having a data access mechanism for facilitating querying thereof; a database having a first and a second plurality of nodes, each of the first plurality of nodes representing an industry business context dimension, each of the second plurality of nodes representing a data source specific dimension of at least one of the data sources, each of the first plurality of nodes related to at least one other of the first plurality of nodes, and each of the second plurality of nodes related to at least one of the first plurality of nodes; and a plurality of data source query function calls, each query function call querying a single data source regarding a single data source specific dimension, and each query function call using the data access mechanism of the single data source to facilitate access to the single data source.
 2. The system of claim 1 wherein each dimension has at least one dimension instance and wherein at least two of the data sources each has a physical or logical structure different from the other and each of the at least two data sources has data relating to a dimension instance, the system further comprising: at least one complex query, the complex query comprising a plurality of data source query function calls, the complex query querying the at least two data sources for data relating to the dimension instance, the complex query calling the plurality of data source query function calls to perform the querying of the at least two data sources for the data relating to the dimension instance, and wherein the data relating to the dimension instance is retrieved from each of the at least two data sources.
 3. The system of claim 1 wherein each dimension has at least one dimension instance, the system further comprising: at least one result set object populated by data returned from a query from a user, wherein the query from the user includes selection of at least one dimension instance and at least one query function call without identification by the user of which data source to query.
 4. The system of claim 1, further comprising: at least one complex query calling a plurality of query function calls to query the plurality of data sources, wherein the one complex query does not identify any data source to query.
 5. The system of claim 1, further comprising: at least one complex query for data located in a plurality of data sources, the complex query calling a plurality of query function calls to query the plurality of data sources for the data, the complex query having a set of input parameters which define the data to be queried for, the set of input parameters consisting of at least one dimension instance, a query result and a description of the data to be queried.
 6. The system of claim 5 wherein the description of the data to be queried is an exact_request_for_information.
 7. A system for managing data relating to an industry having a plurality of industry business context dimensions which define logical groupings of data related to the industry, the data contained in a plurality of data sources, at least one data source having a physical or logical structure differing from at least one other data source, each data source having data which is capable of a logical contextual grouping into at least one data source specific dimension which contains data related to at least one industry business context dimension, and each data source having a data access mechanism for facilitating querying thereof, the system comprising: a UniDimNet and a plurality of UniViews.
 8. The system of claim 7, wherein the UniDimNet further comprises: a plurality of UniDims, each UniDim representing an industry business context dimension, each UniDim related to at least one other UniDim; and a plurality of DataSourceDims, each DataSourceDim representing a data source specific dimension of a data source, and each DataSourceDim related to at least one UniDim.
 9. The system of claim 8, wherein each UniDim and each DataSourceDim is a node in a network which is contained in a database.
 10. The system of claim 9, wherein each node is a table.
 11. The system of claim 9, wherein each dimension has at least one dimension instance, and each dimension instance has a unique identification, wherein: each UniDim table contains the unique identification of each dimension instance of the dimension to which the UniDim relates.
 12. The system of claim 7, wherein each UniView is a query function call which queries a single data source regarding a single data source specific dimension by using the data access mechanism of the data source.
 13. The system of claim 12, further comprising at least one complex query.
 14. The system of claim 13, the complex query having a set of input parameters, the set of input parameters not identifying a data source.
 15. The system of claim 7 further comprising a UniViewer.
 16. A method for managing data relating to an industry having a plurality of industry business context dimensions which define logical groupings of data related to the industry, the data contained in a plurality of data sources, at least one data source having a physical or logical structure differing from at least one other data source, each data source having data which is capable of a logical contextual grouping into at least one data source specific dimension which contains data related to at least one industry business context dimension, and each data source having a data access mechanism for facilitating querying thereof, the method comprising the steps of: identifying a plurality of industry business context dimensions; identifying at least one data source specific dimension for each data source; providing a UniDimNet; providing a plurality of UniViews; formulating a complex query, the complex query using the UniDimNet to assist in calling at least one UniView to query at least one data source; and providing the results of the query to a user.
 17. The method of claim 16, the providing a UniDimNet step further comprising the steps of: creating a UniDim for each industry business context dimension and relating each UniDim to at least one other UniDim.
 18. The method of claim 17, the providing a UniDimNet step further comprising the steps of: creating a DataSourceDim for each data source specific dimension of each data source and relating each DataSourceDim to at least one UniDim.
 19. The method of claim 18 wherein each dimension has at least one dimension instance, the providing a UniDimNet step further comprising the steps of: populating each UniDim and each DataSourceDim in the UniDimNet with data relating to each dimension instance.
 20. A method for querying data relating to an industry having a plurality of industry business context dimensions which define logical groupings of data related to the industry, the data contained in a plurality of data sources, at least one data source having a physical or logical structure differing from at least one other data source, each data source having data which is capable of a logical contextual grouping into at least one data source specific dimension which contains data related to at least one industry business context dimension, each data source having a data access mechanism for facilitating querying thereof, the data sources being part of a system for unifying the data, the system having a plurality of data source query function calls, each query function call querying a single data source regarding a single data source specific dimension, each dimension having at least one dimension instance, the method comprising the steps of: receiving from a user the identity of a dimension to be queried; providing to the user a plurality of data source query function calls from which the user may select at least one data source query function call; creating a result set having columns defined by the data source query functions selected by the user; receiving from a user the identity of at least one dimension instance to perform a query regarding; and populating the columns of the result set with data retrieved from the query.
 21. The method of claim 20, further comprising the step of: providing to the user a list of dimension instances available for the selected dimension.
 22. The method of claim 20, further comprising the step of: modifying the result field based upon a change by the user to data source query function calls selected.
 23. The method of claim 20, further comprising the step of: modifying the result field based upon a change by the user to the dimension instances selected. 